Smart Paper Notes

A Reliable and Low Latency Synchronizing Middleware for Co-simulation of a Heterogeneous Multi-Robot Systems

Emon Dey , Mikolaj Walczak , Mohammad Saeid Anwar , Nirmalya Roy

Category: Robotics

2022-11-10

Search and rescue, wildfire monitoring, and flood/hurricane impact assessment are mission-critical services for recent IoT networks. Communication synchronization, dependability, and minimal communication jitter are major simulation and system issues, given the time-based, physics-based ROS simulator, the event-based, network-based wireless simulator, and the complex dynamics of mobile and heterogeneous IoT devices deployed in actual environments. Simulating a heterogeneous multi-robot system before deployment is difficult because the physics (robotics) and network simulators must be synchronized. Most TCP/IP-based synchronization middlewares use ROS1, owing to its master-based architecture. Our real-time ROS2 architecture with masterless packet discovery synchronizes robotics and wireless network simulations. A velocity-aware Transmission Control Protocol (TCP) technique for ground and aerial robots, using Data Distribution Service (DDS) publish-subscribe transport, minimizes packet loss and synchronization, transmission, and communication jitter. Gazebo and NS-3 are used for simulation and testing, and the middleware is simulator-agnostic. We tested our ROS2-based synchronization middleware for packet loss probability and average latency under LOS/NLOS conditions and with TCP/UDP protocols. In a thorough ablation study, we replaced NS-3 with EMANE, a real-time wireless network simulator, and masterless ROS2 with master-based ROS1. Finally, we tested network synchronization and jitter using one aerial drone (Duckiedrone) and two ground vehicles (TurtleBot3 Burger) on different terrains in masterless (ROS2) and master-enabled (ROS1) clusters. Our middleware shows that a large-scale IoT infrastructure with a diverse set of stationary and robotic devices can achieve low-latency communications (12% and 11% reduction in simulation and in real-world tests, respectively) while meeting mission-critical application reliability (10% and 15% packet loss reduction) and high-fidelity requirements.

Translated by Google Translate

Improving Narrative Relationship Embeddings by Training with Additional Inverse-Relationship Constraints

Mikolaj Figurski

Category: Natural Language Processing | Machine Learning

2022-12-21

We consider the problem of embedding character-entity relationships from the reduced semantic space of narratives, proposing and evaluating the assumption that these relationships hold under a reflection operation. We analyze this assumption and compare the approach to a baseline state-of-the-art model with a unique evaluation that simulates efficacy on a downstream clustering task with human-created labels. Although our model creates clusters that achieve Silhouette scores of -0.084, outperforming the baseline's -0.227, our analysis reveals that the models approach the task very differently and perform well on very different examples. We conclude that our assumption might be useful for specific types of data and should be evaluated on a wider range of tasks.
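
For readers unfamiliar with the metric, the Silhouette score reported above averages, over all points, the quantity s = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the mean distance to the nearest other cluster. Scores range from -1 to 1, and negative values mean that points are, on average, closer to a neighbouring cluster than to their own. The following is a minimal pure-Python sketch assuming Euclidean distance; the toy points and labels are invented for illustration and are not the paper's data or evaluation code.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to every member other than itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        a = mean_dist(i, own)
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != label)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(toy_points, [0, 0, 1, 1])  # labels follow the geometry
bad = silhouette_score(toy_points, [0, 1, 0, 1])   # labels cut across it
print(good, bad)
```

A labelling that follows the geometry scores close to 1, while one that cuts across it goes negative, which is the regime both models in the abstract fall into.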

Self-Supervised Learning for Speech Enhancement through Synthesis

Bryce Irvin , Marko Stamenovic , Mikolaj Kegler , Li-Chia Yang

Category: Natural Language Processing | Machine Learning

2022-11-04

Modern speech enhancement (SE) networks typically implement noise suppression through time-frequency masking, latent representation masking, or discriminative signal prediction. In contrast, some recent works explore SE via generative speech synthesis, where the system's output is synthesized by a neural vocoder after an inherently lossy feature-denoising step. In this paper, we propose a denoising vocoder (DeVo) approach, where a vocoder accepts noisy representations and learns to directly synthesize clean speech. We leverage rich representations from self-supervised learning (SSL) speech models to discover relevant features. We conduct a candidate search across 15 potential SSL front-ends and subsequently train our vocoder adversarially with the best SSL configuration. Additionally, we demonstrate a causal version capable of running on streaming audio with 10ms latency and minimal performance degradation. Finally, we conduct both objective evaluations and subjective listening studies to show our system improves objective metrics and outperforms an existing state-of-the-art SE model subjectively.

Generalisability of deep learning models in low-resource imaging settings: A fetal ultrasound study in 5 African countries

Carla Sendra-Balcells , Víctor M. Campello , Jordina Torrents-Barrena , Yahya Ali Ahmed , Mustafa Elattar , Benard Ohene Botwe , Pempho Nyangulu , William Stones , Mohammed Ammar , Lamya Nawal Benamer

Category: Computer Vision

2022-09-20

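Most artificial intelligence (AI) research has concentrated on high-income countries, where imaging data, IT infrastructure, and clinical expertise are plentiful. Progress has been slower in limited-resource environments where medical imaging is needed. In Sub-Saharan Africa, for example, the rate of perinatal mortality is very high because of limited access to antenatal screening. In these countries, AI models could be implemented to help clinicians acquire fetal ultrasound planes for the diagnosis of fetal abnormalities. Deep learning models have so far been proposed to identify standard fetal planes, but there is no evidence of their ability to generalise in centres with limited access to high-end ultrasound equipment and data. This work investigates different strategies to reduce the domain-shift effect for a fetal plane classification model trained at a high-resource clinical centre and transferred to a new low-resource centre. To that end, a classifier is first evaluated at a new centre in Denmark with 1,008 patients and is later optimised to reach the same performance in five African centres (Egypt, Algeria, Uganda, Ghana, and Malawi) with 25 patients each. The results show that a transfer learning approach can be a solution for integrating small African samples with existing large-scale databases in developed countries. In particular, the model can boost performance by increasing the recall to $0.92 \pm 0.04$ while maintaining high precision. This framework shows promise for building new AI models that generalise across clinical centres with limited data acquired in challenging and heterogeneous conditions, and it calls for further research to develop new solutions for the usability of AI in countries with fewer resources.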

BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping

Gasser Elbanna , Neil Scheidwasser-Clow , Mikolaj Kegler , Pierre Beckmann , Karl El Hajal , Milos Cernak

Category: Artificial Intelligence | Machine Learning

2022-06-24

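Methods for extracting audio and speech features have been studied since the pioneering work on spectrum analysis decades ago. Recent efforts are guided by the ambition to develop general-purpose audio representations. For example, deep neural networks can extract optimal embeddings if they are trained on large audio datasets. This work extends existing methods based on self-supervised learning by bootstrapping, proposes various encoder architectures, and explores the effects of using different pre-training datasets. Lastly, we present a novel training framework that produces a hybrid audio representation combining handcrafted and data-driven learned audio features. All the proposed representations were evaluated within the HEAR NeurIPS 2021 challenge on auditory scene classification and timestamp detection tasks. Our results indicate that the hybrid model with a convolutional transformer as the encoder yields superior performance in most HEAR challenge tasks.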

DeepJSCC-Q: Constellation Constrained Deep Joint Source-Channel Coding

Tze-Yang Tung , David Burth Kurka , Mikolaj Jankowski , Deniz Gunduz

Category: Machine Learning

2022-06-16

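Recent works have shown that modern machine learning techniques can provide an alternative approach to the long-standing joint source-channel coding (JSCC) problem. Very promising initial results, superior to popular digital schemes that use separate source and channel codes, have been demonstrated for wireless image and video transmission using deep neural networks (DNNs). However, end-to-end training of such schemes requires a differentiable channel input representation; prior works have therefore assumed that any complex value can be transmitted over the channel. This can prevent the application of these codes in scenarios where the hardware or protocol can only admit certain sets of channel inputs, prescribed by a digital constellation. Herein, we propose DeepJSCC-Q, an end-to-end optimized JSCC solution that uses a finite channel input alphabet. We show that DeepJSCC-Q can achieve performance similar to prior works that allow any complex-valued channel input, especially when high modulation orders are available, and that the performance asymptotically approaches that of unconstrained channel input as the modulation order increases. Importantly, DeepJSCC-Q preserves the graceful degradation of image quality in unpredictable channel conditions, a desirable property for deployment in mobile systems with rapidly changing channel conditions.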
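
To make the constellation constraint concrete, the sketch below maps arbitrary complex channel inputs to the nearest point of a square QAM constellation. It is only an illustration of a finite channel input alphabet: the function names, the choice of square QAM, and the sample symbols are assumptions, and the abstract does not say how DeepJSCC-Q keeps this step trainable, so no differentiable relaxation is shown.

```python
def qam_constellation(order):
    """Square QAM points scaled to unit average power (order = 4, 16, 64, ...)."""
    side = round(order ** 0.5)
    if side < 2 or side * side != order:
        raise ValueError("order must be a perfect square >= 4")
    levels = [2 * k - (side - 1) for k in range(side)]  # e.g. -3, -1, 1, 3
    points = [complex(i, q) for i in levels for q in levels]
    scale = (sum(abs(p) ** 2 for p in points) / len(points)) ** 0.5
    return [p / scale for p in points]


def quantize(symbols, constellation):
    """Hard-map each complex symbol to its nearest constellation point."""
    return [min(constellation, key=lambda c: abs(s - c)) for s in symbols]


def mean_error(symbols, constellation):
    """Average distance between the symbols and their quantized versions."""
    mapped = quantize(symbols, constellation)
    return sum(abs(s - m) for s, m in zip(symbols, mapped)) / len(symbols)


# Hypothetical encoder outputs (unconstrained complex values).
symbols = [0.3 + 0.1j, -0.5 + 0.7j, 0.9 - 0.2j, -0.1 - 0.8j]
for order in (4, 16, 64):
    print(order, mean_error(symbols, qam_constellation(order)))
```

For these sample symbols the quantization error shrinks as the constellation grows from 4 to 64 points, which gives some intuition for the abstract's claim that performance approaches the unconstrained case as the modulation order increases.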

Machine Learning based Framework for Robust Price-Sensitivity Estimation with Application to Airline Pricing

Ravi Kumar , Shahin Boluki , Karl Isler , Jonas Rauch , Darius Walczak

Category: Machine Learning (Statistics) | Machine Learning

2022-05-04

We consider the problem of dynamic pricing of a product in the presence of feature-dependent price sensitivity. Developing practical algorithms that can estimate price elasticities robustly, especially when information about no purchases (losses) is not available, to drive such automated pricing systems is a challenge faced by many industries. Based on the Poisson semi-parametric approach, we construct a flexible yet interpretable demand model in which the price-related part is parametric while the remaining (nuisance) part is non-parametric and can be modeled via sophisticated machine learning (ML) techniques. Estimating the price-sensitivity parameters of this model via direct one-stage regression techniques may lead to biased estimates due to regularization. To address this concern, we propose a two-stage estimation methodology that makes the estimation of the price-sensitivity parameters robust to biases in the estimators of the nuisance parameters of the model. In the first stage, we construct estimators of observed purchases and prices given the feature vector, using sophisticated ML estimators such as deep neural networks. In the second stage, we use the first-stage estimators in a Bayesian dynamic generalized linear model to estimate the price-sensitivity parameters. We test the performance of the proposed estimation schemes on simulated and real sales transaction data from the airline industry. Our numerical studies demonstrate that the proposed two-stage approach reduces the estimation error in price-sensitivity parameters from 25% to 4% in realistic simulation settings. The two-stage estimation techniques proposed in this work allow practitioners to leverage modern ML techniques to robustly estimate price sensitivities while maintaining interpretability and allowing ease of validation of the model's constituent parts.

Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load

Gasser Elbanna , Alice Biryukov , Neil Scheidwasser-Clow , Lara Orlandic , Pablo Mainar , Mikolaj Kegler , Pierre Beckmann , Milos Cernak

Category: Artificial Intelligence | Machine Learning

2022-03-30

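As a neurophysiological response to threat or adverse conditions, stress can affect cognition, emotion, and behaviour, with potentially detrimental effects on health in the case of sustained exposure. Since the affective content of speech is inherently modulated by an individual's physical and mental state, a substantial body of research has been devoted to the study of paralinguistic correlates of stress-inducing task load. Historically, voice stress analysis (VSA) has been conducted using conventional digital signal processing (DSP) techniques. Despite the development of modern methods based on deep neural networks (DNNs), accurately detecting stress in speech remains difficult because of the wide variety of stressors and the variability in individual stress perception. To that end, we introduce a set of five datasets for task load detection in speech. The voice recordings were collected while either cognitive or physical stress was induced in a cohort of volunteers, with a cumulative number of more than a hundred speakers. We used the datasets to design and evaluate a novel self-supervised audio representation that leverages the effectiveness of handcrafted (DSP-based) features and the complexity of data-driven DNN representations. Notably, the proposed approach outperformed both extensive handcrafted feature sets and novel DNN-based audio representation learning approaches.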

Step-unrolled Denoising Autoencoders for Text Generation

Nikolay Savinov , Junyoung Chung , Mikolaj Binkowski , Erich Elsen , Aaron van den Oord

Category: Natural Language Processing | Machine Learning

2021-12-13

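In this paper we propose a new generative model of text, the Step-unrolled Denoising Autoencoder (SUNDAE), that does not rely on autoregressive models. Similarly to denoising diffusion techniques, SUNDAE is repeatedly applied to a sequence of tokens, starting from random inputs and improving them each time until convergence. We present a simple new improvement operator that converges in fewer iterations than diffusion methods while qualitatively producing better samples on natural language datasets. SUNDAE achieves state-of-the-art results (among non-autoregressive methods) on the WMT'14 English-to-German translation task, and good qualitative results on unconditional language modeling on the Colossal Cleaned Common Crawl dataset and on a dataset of Python code from GitHub. The non-autoregressive nature of SUNDAE opens up possibilities beyond left-to-right prompted generation, by filling in arbitrary blank patterns in a template.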

Neural Weight Step Video Compression

Mikolaj Czerkawski , Javier Cardona , Robert Atkinson , Craig Michie , Ivan Andonovic , Carmine Clemente , Christos Tachtatzis

Category: Computer Vision

2021-12-02

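A variety of compression methods based on encoding images as the weights of a neural network have recently been proposed. Yet the potential of similar approaches for video compression remains unexplored. In this work, we propose a set of experiments for testing the feasibility of compressing video using two architectural paradigms: the coordinate-based MLP (CbMLP) and the convolutional network. Furthermore, we propose a novel technique of neural weight stepping, in which subsequent frames of a video are encoded as low-entropy parameter updates. To assess the feasibility of the considered approaches, we will test video compression performance on several high-resolution video datasets and compare against existing conventional and neural compression techniques.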
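
The idea of encoding subsequent frames as parameter updates can be sketched with plain lists of numbers. The code below is not the authors' method: it assumes, purely for illustration, that each frame has already been fitted to a weight vector, and it uses a hypothetical uniform quantizer with step size `step`. The first frame's weights are stored in full and every later frame is stored as a quantized difference from the decoder's current reconstruction, so rounding errors do not accumulate.

```python
def quantize(values, step):
    """Uniform quantization of floats to integer indices."""
    return [round(v / step) for v in values]


def encode_steps(frame_weights, step):
    """Return quantized first-frame weights and quantized per-frame updates."""
    base = quantize(frame_weights[0], step)
    recon = [q * step for q in base]  # what the decoder will hold
    updates = []
    for weights in frame_weights[1:]:
        # Difference is taken against the reconstruction, not the true weights.
        delta = quantize([w - r for w, r in zip(weights, recon)], step)
        recon = [r + d * step for r, d in zip(recon, delta)]
        updates.append(delta)
    return base, updates


def decode_steps(base, updates, step):
    """Rebuild every frame's weights from the base and the updates."""
    recon = [q * step for q in base]
    frames = [list(recon)]
    for delta in updates:
        recon = [r + d * step for r, d in zip(recon, delta)]
        frames.append(list(recon))
    return frames


# Hypothetical weights for three consecutive, similar frames.
weights = [
    [0.80, -1.20, 0.35, 2.10],
    [0.82, -1.19, 0.35, 2.07],
    [0.85, -1.19, 0.33, 2.07],
]
base, updates = encode_steps(weights, step=0.01)
decoded = decode_steps(base, updates, step=0.01)
print(base, updates)
```

When consecutive frames are similar, the updates tend to be small integers clustered around zero, which is the kind of low-entropy stream an entropy coder can store compactly.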
